AITopics | linear transformation

Collaborating Authors

linear transformation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

93fce71def4e3cf418918805455d436f-Paper-Conference.pdf

Neural Information Processing SystemsJun-19-2026, 21:44:39 GMT

When and why representations learned by different deep neural networks are similar is an active research topic. We choose to address these questions from the perspective of identifiability theory, which suggests that a measure of representational similarity should be invariant to transformations that leave the model distribution unchanged. Focusing on a model family which includes several popular pre-training approaches, e.g., autoregressive language models, we explore when models which generate distributions that are close have similar representations. We prove that a small Kullback-Leibler divergence between the model distributions does not guarantee that the corresponding representations are similar. This has the important corollary that models with near-maximum data likelihood can still learn dissimilar representations--a phenomenon mirrored in our experiments with models trained on CIFAR-10. We then define a distributional distance for which closeness implies representational similarity, and in synthetic experiments, we find that wider networks learn distributions which are closer with respect to our distance and have more similar representations. Our results thus clarify the link between closeness in distribution and representational similarity.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization

Neural Information Processing SystemsJun-15-2026, 10:07:24 GMT

We introduce ReplaceMe, a generalized training-free depth pruning method that effectively replaces transformer blocks with a linear operation, while maintaining high performance for low compression ratios. In contrast to conventional pruning approaches that require additional training or fine-tuning, our approach requires only a small calibration dataset that is used to estimate a linear transformation, which approximates the pruned blocks. The estimated linear mapping can be seamlessly merged with the remaining transformer blocks, eliminating the need for any additional network parameters. Our experiments show that ReplaceMe consistently outperforms other training-free approaches and remains highly competitive with state-of-the-art pruning methods that involve extensive retraining/fine-tuning and architectural modifications. Applied to several large language models (LLMs), ReplaceMe achieves up to 25% pruning while retaining approximately 90% of the original model's performance on open benchmarks--without any training or healing steps, resulting in minimal computational overhead.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Education (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

d94b46ec30adee2bbb134f813fc9dde0-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-29-2026, 23:20:17 GMT

artificial intelligence, machine learning, representation, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Forster Decomposition and Learning Halfspaces with Noise

Neural Information Processing SystemsApr-25-2026, 14:24:29 GMT

AForster transform is an operation that turns a distribution into one with good anticoncentration properties. While a Forster transform does not always exist, we show that any distribution can be efficiently decomposed as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently. As the main application of this result, we obtain the first polynomial-time algorithm for distribution-independent PAC learning of halfspaces in the Massart noise model with strongly polynomial sample complexity, i.e., independent of the bit complexity of the examples. Previous algorithms for this learning problem incurred sample complexity scaling polynomially with the bit complexity, even though such a dependence is not information-theoretically necessary.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices

Neural Information Processing SystemsApr-24-2026, 14:56:38 GMT

Language models employ a very large number of trainable parameters. Despite being highly overparameterized, these networks often achieve good out-of-sample test performance on the original task and easily fine-tune to related tasks. Recent observations involving, for example, intrinsic dimension of the objective landscape and the lottery ticket hypothesis, indicate that often training actively involves only a small fraction of the parameter space. Thus, a question remains how large a parameter space needs to be in the first place -- the evidence from recent work on model compression, parameter sharing, factorized representations, and knowledge distillation increasingly shows that models can be made much smaller and still perform well. Here, we focus on factorized representations of matrices that underpin dense, embedding, and self-attention layers. We use low-rank factorized representation of a reshaped and rearranged original matrix to achieve space efficient and expressive linear layers. We prove that stacking such low-rank layers increases their expressiveness, providing theoretical understanding for their effectiveness in deep networks. In Transformer models, our approach leads to more than tenfold reduction in the number of total trainable parameters, including embedding, attention, and feed-forward layers, with little degradation in on-task performance. The approach operates out-of-the-box, replacing each parameter matrix with its compact equivalent while maintaining the architecture of the network.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions

Neural Information Processing SystemsMar-22-2026, 12:45:13 GMT

We propose a method for metric-scale monocular depth estimation. Inferring depth from a single image is an ill-posed problem due to the loss of scale from perspective projection during the image formation process. Any scale chosen is a bias, typically stemming from training on a dataset; hence, existing works have instead opted to use relative (normalized, inverse) depth. Our goal is to recover metric-scaled depth maps through a linear transformation. The crux of our method lies in the observation that certain objects (e.g., cars, trees, street signs) are typically found or associated with certain types of scenes (e.g., outdoor).

artificial intelligence, name change, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.55)

Add feedback

e59783120660eb4e847550d4584e74d6-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 12:14:17 GMT

artificial intelligence, machine learning, transformation, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Jiangsu Province > Changzhou (0.04)
Asia > China > Fujian Province > Xiamen (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Add feedback

ca70528fb11dc8086c6a623da9f3fee6-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 04:00:31 GMT

intervention, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(6 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Figure 9: In experiments, we used a common feature-extractor (F

Neural Information Processing SystemsFeb-17-2026, 10:29:33 GMT

Here, we include implementation details omitted from the main paper for brevity. Upon acceptance, a deanonymized repository will be released. The last layer's dimension depended upon the exact The feature extractors and decoders varied by domain. In particular, we found that if we did not apply this linear transformation (i.e., pass the raw encodings For VQ-based methods, use a large enough codebook to have at least one element per class. Other differences simply reflected differences in architecture (e.g., For iNat, we trained all models with batch size 256, using the hyperparameters specified in Table 3.

artificial intelligence, machine learning, representation, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the Properties of Kullback-Leibler Divergence Between Multivariate Gaussian Distributions

Neural Information Processing SystemsFeb-16-2026, 16:47:52 GMT

Kullback-Leibler (KL) divergence is one of the most important measures to calculate the difference between probability distributions. In this paper, we theoretically study several properties of KL divergence between multivariate Gaussian distributions.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Hunan Province > Changsha (0.04)
North America > United States > California (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Add feedback